127 research outputs found

    Pragmatic and Cultural Considerations for Deception Detection in Asian Languages

    In hopes of sparking a discussion, I argue for much-needed research on automated deception detection in Asian languages. The task of discerning truthful texts from deceptive ones is challenging, but a logical sequel to opinion mining. I suggest that applied computational linguists pursue broader interdisciplinary research on cultural differences and pragmatic use of language in Asian cultures, before turning to detection methods based on a primarily Western (English-centric) worldview. Deception is fundamentally human, but how do various cultures interpret and judge deceptive behavior

    Deception Detection and Rumor Debunking for Social Media

    Abstract: The main premise of this chapter is that the time is ripe for more extensive research and development of social media tools that filter out intentionally deceptive information such as deceptive memes, rumors and hoaxes, fake news or other fake posts, tweets and fraudulent profiles. Social media users’ awareness of intentional manipulation of online content appears to be relatively low, while the reliance on unverified information (often obtained from strangers) is at an all-time high. I argue there is a need for content verification, systematic fact-checking and filtering of social media streams. This literature survey provides a background for understanding current automated deception detection research, rumor debunking, and broader content verification methodologies, suggests a path towards hybrid technologies, and explains why the development and adoption of such tools might still be a significant challenge

    Deception Detection & Rumor Debunking

    (1) Deception Detection and (2) Rumor Debunking, as the title suggests, and I will argue for the need of hybrid methods (in a combination of the two). My main goal here is to point researchers interested in social media research towards these two exciting fields. I predict that such technologies (with more R&D, as they mature) will become indispensable in our attention-economy. Content producers are rushed to be first in the news stream, and social media consumers simply don’t have time or energy to verify content that is pushed at them

    Perceptions of Clickbait: A Q-Methodology Approach

    Clickbait is “content whose main purpose is to attract attention and encourage visitors to click on a link to a particular web page” (“clickbait,” n.d.). The term is also generally used to refer specifically to the attention-grabbing headlines. Critics of clickbait argue that clickbait is shallow, misleading, and ubiquitous – “a new word that has become synonymous with online journalism” (Frampton, 2015). It is the subject of a small but growing number of studies in disciplines ranging from linguistics to communications and information sciences. Palau-Sampio (2016) analyzed linguistic strategies associated with tabloid journalism in the Spanish digital newspaper Elpais.com, concluding that there is a trend towards lower quality news reporting. In their research on Danish news sites, Blom & Hansen (2015) identified forward-referencing, specifically the use of empty pronouns to create an information gap, as a feature of clickbait headlines. Chen, Conroy & Rubin (2015) proposed that automatic identification of clickbait could draw upon three types of features: a) lexico-semantic and pragmatic linguistic patterns (e.g. unresolved pronouns, affective and suspenseful language, action words, overuse of numerals, and reverse narratives), b) incongruent image placement with a possible emotional load, and c) user reading and commenting behavior. An effort in automated identification of clickbait by Potthast et al. (2016) achieved 79% accuracy on tweets. But debate still rages over what the word actually means (Gardiner, 2015)

    Veracity Roadmap: Is Big Data Objective, Truthful and Credible?

    This paper argues that big data can possess different characteristics, which affect its quality. Depending on its origin, data processing technologies, and methodologies used for data collection and scientific discoveries, big data can have biases, ambiguities, and inaccuracies which need to be identified and accounted for to reduce inference errors and improve the accuracy of generated insights. Big data veracity is now being recognized as a necessary property for its utilization, complementing the three previously established quality dimensions (volume, variety, and velocity). But there has been little discussion of the concept of veracity thus far. This paper provides a roadmap for theoretical and empirical definitions of veracity along with its practical implications. We explore veracity across three main dimensions: 1) objectivity/subjectivity, 2) truthfulness/deception, 3) credibility/implausibility – and propose to operationalize each of these dimensions with either existing computational tools or potential ones, relevant particularly to textual data analytics. We combine the measures of veracity dimensions into one composite index – the big data veracity index. This newly developed veracity index provides a useful way of assessing systematic variations in big data quality across datasets with textual information. The paper contributes to big data research by categorizing the range of existing tools to measure the suggested dimensions, and to Library and Information Science (LIS) by proposing to account for heterogeneity of diverse big data, and to identify information quality dimensions important for each big data type

    Comparative Stylistic Fanfiction Analysis: Popular and Unpopular Fics across Eleven Fandoms

    Abstract: This study analyses the stylistic feature variation of 545 sample fanfiction stories (fics) by popularity and across eleven ‘fandoms’ in creative writing forums. Lexical richness, average sentence and paragraph lengths are isolated as promising measures for a text classifier to use in predicting a fic’s likely popularity in its fandom. Résumé: This study analyses a sample of 545 chapters of fanfiction works (fics) according to their stylistic variation and their popularity in eleven different ‘fandoms’. Lexical richness, average sentence length and average paragraph length were chosen as stylistic features suited to distinguishing popular fics from unpopular ones

    Differences Over Discourse Structure Differences: A Reply to Urquhart and Urquhart

    Purpose – In this paper we respond to Urquhart and Urquhart’s critique of our previous work entitled “Discourse structure differences in lay and professional health communication”, published in this journal in 2012 (Vol. 68 No. 6, pp. 826–851, doi: 10.1108/00220411211277064). Design/methodology/approach – We examine Urquhart and Urquhart’s critique and provide responses to their concerns and cautionary remarks against cross-disciplinary contributions. We reiterate our central claim. Findings – We argue that Mann and Thompson’s (1987, 1988) Rhetorical Structure Theory (RST) offers valuable insights into computer-mediated health communication and deserves further discussion of its methodological strengths and weaknesses for application in LIS. Research limitations/implications – While we agree that some methodological limitations pointed out by Urquhart and Urquhart are valid, we take this opportunity to correct certain misunderstandings and misstatements. Originality/value – We argue for continued use of innovative techniques borrowed from neighboring disciplines, in spite of objections from researchers accustomed to a familiar strand of literature. We encourage researchers to consider RST and other computational linguistics-based discourse analysis annotation frameworks that could provide the basis for integrated research, and eventual applications in information behaviour and information retrieval
